Abstract —Identical twin face recognition is a challenging task
due to the high degree of correlation in overall facial
appearance. Commercial face recognition systems exhibit poor
performance in differentiating between identical twins under
practical conditions. In this paper, we study the usability of facial marks
as biometric signatures to distinguish between identical twins. We
propose a multiscale automatic facial mark detector based on a
gradient-based operator known as the fast radial symmetry
transform. The transform detects bright or dark regions with high
radial symmetry at different scales. Next, the detections are tracked
across scales to determine the prominence of facial marks.
Extensive experiments are performed on both manually annotated and
automatically detected facial marks to evaluate the usefulness of
facial marks as biometric signatures. Experimental results are based
on identical twin images acquired at the 2009 Twins Days Festival
in Twinsburg, Ohio. The results of our analysis indicate the
usefulness of the distribution of facial marks as a biometric signature.
In addition, our results indicate some degree of
correlation between the geometric distributions of facial marks across
identical twins.

Index Terms —Face recognition, facial marks, identical twins. 


I. Introduction

The ability to distinguish between identical twins based
on different biometric modalities such as face, iris,
fingerprint, etc., is a challenging and interesting problem in the
biometric area [1]-[5]. Identical twins (also known as
monozygotic twins) are formed when a zygote splits and forms two
embryos. They cannot be discriminated based on DNA. Therefore,
other biometric traits are needed to distinguish between
identical twins. Using face recognition to differentiate between
identical twins is very difficult [3] because of
the high degree of similarity in their overall facial appearance.
In this paper we focus on distinguishing between monozygotic
twins based on localized facial features known as facial marks.

Traditionally, biometrics research has focused primarily on
developing robust characterizations and systems to deal with
challenges posed by variations in acquisition conditions (such
as pose, illumination condition, distance from sensor, etc.) and
the presence of noise in the acquired data [6]. Only recently
have researchers started to look at the challenges involved in
distinguishing between identical twins
[1], [3]. Developing techniques and systems that improve twin
face recognition should also improve generic face recognition
systems. Although identical twins represent only 0.5% of the
global population [2], failure to correctly identify each twin has
led to problems for law enforcement agencies [1]. There have
been several criminal cases in which either both or neither of
the identical twins was convicted due to the difficulty in
determining the correct identity of the perpetrator [1].

Fig. 1. A pair of identical twins from the identical twins dataset. We observe
a high degree of overall facial similarity and the difference in the number and
type of facial marks.

In this paper, we propose to differentiate between identical
twins using facial marks alone. Facial marks are considered to
be unique and inherent characteristics of an individual. Fig. 1
shows a pair of identical twins from the dataset. Although they
are similar in appearance, they can be distinguished using facial
marks. High-resolution images enable us to capture these finer
details on the face [7]. Facial marks are defined as visible changes
in the skin that differ in texture, shape and color from the
surrounding skin [8]. Facial marks appear at random positions
on the face. By extracting different facial mark features we aim
to differentiate between identical twins. We have defined eleven
types of facial marks including moles, freckles, freckle groups,
darkened skin, lightened skin, etc., for the analysis.

Fig. 2. Approach to distinguish between identical twins using manually detected facial marks.



Fig. 3. Overview of the proposed multiscale automatic facial mark detection process. 


Initially, each image in the identical twin dataset is manually
annotated by multiple observers to determine the different types
of perceptible facial marks. Manually annotated facial marks are
characterized both by location and category. The approach to
distinguish between monozygotic twins based on manually detected
facial marks is shown in Fig. 2. Next, we propose a multiscale
automatic facial mark detector based on the fast radial symmetry
transform (FRST) [9]. The transform detects dark regions with
high radial symmetry. An overview of the proposed multiscale
automatic facial mark detector is shown in Fig. 3. Initially, an
image is represented at multiple scales in the form of a Gaussian
pyramid. An Active Shape Model (ASM) [10] is used to detect the
contours of the primary facial features like eyes, lips, nostrils and
eyebrows. Using the output of the ASM, a mask is created to
remove the primary facial features. Next, the FRST is applied to the
masked image to detect dark regions with radial symmetry. The
aforementioned steps are applied to all images in the Gaussian
pyramid. Finally, the detections are tracked across scales.
Automatically detected facial marks are characterized only by
geometric location. The locations of facial marks are converted from
image pixel coordinates to barycentric coordinates [11] to
facilitate interimage comparison of landmark locations. The similarity
in the distribution of facial marks is used to determine the
similarity between two face images. The similarity is computed by
formulating a bipartite graph matching problem.
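
The coordinate conversion mentioned above can be illustrated with a minimal sketch. It assumes a reference triangle given by three 2-D landmark positions (Section VI uses the eye centers and the nose tip); the function name `barycentric` and all pixel values below are hypothetical and chosen only for illustration.

```python
def barycentric(point, a, b, c):
    """Barycentric coordinates (l1, l2, l3) of a 2-D point with respect
    to the triangle with vertices a, b, c (each an (x, y) pair)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("reference triangle is degenerate")
    l1 = ((b[1] - c[1]) * (point[0] - c[0])
          + (c[0] - b[0]) * (point[1] - c[1])) / det
    l2 = ((c[1] - a[1]) * (point[0] - c[0])
          + (a[0] - c[0]) * (point[1] - c[1])) / det
    return (l1, l2, 1.0 - l1 - l2)


# Hypothetical pixel positions of the reference landmarks and one mark.
left_eye, right_eye, nose_tip = (100.0, 120.0), (200.0, 120.0), (150.0, 190.0)
mark = (170.0, 210.0)
coords = barycentric(mark, left_eye, right_eye, nose_tip)


def moved(p):
    """The same face imaged at twice the scale and shifted."""
    return (2.0 * p[0] + 30.0, 2.0 * p[1] - 15.0)


coords_moved = barycentric(moved(mark), moved(left_eye),
                           moved(right_eye), moved(nose_tip))
```

Because the coordinates are expressed relative to the triangle vertices, `coords` and `coords_moved` agree up to floating-point error, which is what makes mark locations comparable across images.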


Extensive experiments are conducted on the manually and
automatically detected facial marks. The data used for the
investigation was acquired at the Twins Days Festival at
Twinsburg, Ohio in 2009 [3]. The dataset consists of 477 images
corresponding to 178 subjects and 89 pairs of twins. The presented
results indicate the need for an automatic facial mark detector
and demonstrate that facial marks can be used as biometric
signatures to distinguish between identical twins. Prior research has
claimed that the number of facial marks between twins is similar
but the distribution of facial marks across twins is different [12].
We also analyze this conjecture. Contrary to the commonly held
belief, our results indicate nontrivial correlation between
distributions of facial marks across identical twins.

A preliminary version of this investigation was presented in
[13]. In [13], we presented the use of facial marks as biometric
signatures by analyzing facial marks annotated by multiple
observers. The dataset used in [13] consisted of only 295 twin face
images and 76 pairs of twins. In this paper we have introduced
a multiscale automatic facial mark detector based on the fast
radial symmetry transform and evaluated its performance. We
also compare performance across multiple annotation sessions
and multiple observers. The different experiments are executed
on a larger dataset.

The paper is organized as follows. Section II discusses related
work. Section III describes the different categories
of facial marks. Section IV describes the identical twin dataset
used in this investigation. The manual annotation process is
described in Section V. Section VI describes the
matching process, and Section VII describes the proposed
multiscale automatic facial mark detector. The details of the
experimental setup and results are presented in Section VIII. The
paper concludes with a brief summary and discussion.

II. Related Work 

We first discuss prior research related to facial mark
detection and then focus on identical twin face recognition research.

A. Related Work on Facial Mark Detections 

Lin et al. [7] represented the face at multiple layers in
terms of global appearance, facial features, skin texture and
irregularities that contribute towards identification. Global
appearance and facial features are modeled using a multilevel
PCA (principal component analysis) followed by regularized
LDA (linear discriminant analysis). A Scale Invariant Feature
Transform (SIFT) is employed to detect and describe details
of irregular skin regions, which is combined with elastic graph
matching for recognition. Improved performance was achieved
by fusing facial features at multiple levels. Pierrard et al.
[14] presented a framework to localize prominent facial skin
irregularities, like moles and birthmarks. They use a
multiscale template matching algorithm for face recognition. A
discriminative factor is computed for each point by using skin
segmentation and a local saliency measure and is used to filter
points.

Recently, Zhang et al. [15] designed a facial skin mark
matcher based on a region growing algorithm. Each facial
mark is described in terms of position, color intensity and size.
The results of the facial skin mark matcher are fused with the
results of a PCA-based matcher to evaluate the performance of
the system. A drawback of the method is that it requires
prior knowledge about the location of facial marks in order to
apply the region growing algorithm. Park et al. [16] proposed
to use facial marks as soft biometrics. They initially map each
face image contour obtained from an Active Appearance Model
(AAM) into a mean shape using barycentric texture mapping.
The mean shape images are then filtered using a Laplacian of
Gaussian filter, which acts as a blob detector. Once the facial
marks are detected, matching is performed based on Euclidean
distance with a set threshold. The number of matches represents
the similarity score between images. However, they do not
use facial marks by themselves to evaluate performance. They
fuse the scores from the facial mark matcher and a commercial
face recognition software to evaluate the performance. They
observed marginal improvement in performance by fusing the
two scores.

We have not compared the proposed method with any other
previously published methods mainly because previous
approaches fuse different face features with the features obtained
from facial marks. The work proposed in this paper uses only
facial marks as biometric signatures to distinguish between
individuals. Most previously published work uses low-resolution
images. The number of facial marks detected in low-resolution
images is generally lower than the number of facial marks
detected in high-resolution images. Since a higher number of
facial marks are detected using high-resolution images, we are
able to represent an individual by a larger and more distinctive
feature set. Therefore, the performance of the low-resolution
approaches is not directly comparable to the proposed
approach.

B. Related Work on Identical Twin Biometrics 

Recently, researchers have started to look at the challenges
involved in distinguishing between identical twins.
Kong et al. [17] observed that palm prints from
identical twins have correlated features (though they were able to
distinguish between them based on other nongenetic
information). The same observation was made by Jain et al. [4] for
fingerprints. They observed that though fingerprints appear to be
more similar for identical twins than unrelated persons,
fingerprint matching systems can distinguish between them.
Genetically identical irises were compared by Daugman and Downing
[5] and were found to be as uncorrelated as the patterns of irises
from unrelated persons. Kodate et al. [18] experimented with
ten sets of identical twins using a 2-D face recognition system.

Recently, Sun et al. [1] presented a study of distinctiveness
of biometric characteristics in identical twins using fingerprint,
face and iris biometrics. They observed that though iris and
fingerprints show little to no degradation in performance when
dealing with identical twins, face matchers experienced
problems in distinguishing between identical twins. All of these
studies were either conducted on very small twin biometric
datasets or evaluated using existing inhouse or commercial
matchers. Phillips et al. [3] presented the first detailed study
on discrimination of identical twins using different face
recognition algorithms. They compared three different commercial
face recognition algorithms on the identical twins dataset
acquired at the Twins Days Festival in Twinsburg, Ohio. The dataset
consists of images acquired under varying conditions such as
facial pose, illumination, facial expression, etc. They observed
that it is easier to distinguish between identical twins under
controlled studio-like settings than under uncontrolled settings.

III. Types of Facial Marks 

A facial mark is defined as a region of skin or superficial
growth that does not resemble the skin in the surrounding area.
Facial marks represent finer details on the face. They contain
information useful to discriminate between identical twins.
Availability of high-resolution images enables us to view facial marks
in greater detail for analysis. We have identified and defined the
following facial marks (shown in Fig. 4):

1) Mole: A small flat spot less than 1 cm in diameter. The
color of a mole is not the same as the nearby skin. It appears
in a variety of shapes and is normally black in color.

2) Freckle: A small flat spot less than 1 cm in diameter that
appears in a variety of shapes. It is usually brown in color.

3) Freckle group: A cluster of freckles. 

4) Lightened patch: A flat spot that is more than 1 cm in 
diameter and appears in different shapes. It is lighter in 
color than its surroundings. 



SRINIVAS et al: ANALYSIS OF FACIAL MARKS TO DISTINGUISH BETWEEN IDENTICAL TWINS 


Fig. 4. The different categories of facial marks defined. [Panel labels recoverable from the figure: Mole, "Freckle and" (label truncated), Raised Skin, Scar (Round), Pockmark, Acne.]


5) Darkened patch: A flat spot that is more than 1 cm in 
diameter and appears in different shapes. These spots are 
darker in color than their surroundings. 

6) Birthmark: A persistent visible mark on the skin that is
evident at birth or shortly thereafter. Birthmarks are
generally pink, red, or brown in color.

7) Splotchiness: An irregularly shaped spot, stain, or colored 
or discolored area. 

8) Raised skin: A solid, raised mark less than 1 cm across. 
It has a rough texture and appears red, pink, or brown in 
color. 

9) Scar: Discolored tissue that permanently replaces normal 
skin after destruction of the epidermis. 

10) Pockmark: A hollow area or small indentation. 

11) Pimple: A raised lesion that is temporary in nature. 


IV. Data

The dataset consists of face images of identical twins
acquired over two days in August 2009 at the Twins Days
Festival in Twinsburg, Ohio [3]. Face images were captured under
different scenarios and conditions: controlled and
uncontrolled lighting, presence and absence of eyeglasses, different
facial expressions such as smile or neutral, and different poses with yaw
ranging from -90 to 90 degrees, where 0 degrees is a frontal
view. The dataset used for the proposed experiments consists
of only frontal (yaw = 0) face images with no glasses, no
facial hair and a neutral expression. These images were captured
under controlled lighting. The 2009 dataset consists of 477
images corresponding to 178 subjects and 89 pairs of twins. Fig. 1
shows images of a set of identical twins from the dataset. The
guidelines used to capture facial images at the Twins Days
Festival match the requirements defined by SAP level 51 [19]. The
resolution of the images is 4310 x 2868 (w x h). The average
interpupillary distance is 567 pixels. These high-resolution
images enable us to observe finer details on the face when
compared to low-resolution images.

V. Manual Annotations 

To evaluate the usefulness of the facial marks, the twins
dataset was annotated initially by multiple observers. Manual
annotations enable us to gain insight into the different categories
and number of facial marks present in the dataset, the location
of facial marks and an understanding of how human observers
visualize and differentiate between facial marks. The manual
annotation process is accomplished using Markit, a facial
annotation tool developed at our laboratory by Matthew Pruitt.
Markit was designed to aid users in manually annotating images,
and has three main components.

1) Display component: The images are displayed so that the 
user may observe and annotate the various facial marks. 

2) Annotation component: Contains a list of predefined facial 
marks for annotations. 

3) Tools component: Presents different shapes of bounding 
boxes to perform the actual annotations. 

The metadata produced by Markit for each image consists of 
the number, types and locations of annotated facial marks. We 
conducted two sessions of manual annotations with a time lapse 
of six months. In each session, a different set of images of the 
identical twins dataset was annotated. In the first session 275 
images were annotated and in the second session 202 images 
were annotated for a total of 477 annotated images. 

Four observers, denoted 1 through 4, annotated the images in
two sessions. Observers 1, 2 and 3 did the first session and 1,
2, and 4 did the second session. The observers had no prior
experience with facial mark annotation. Hence the observers were
provided with the definitions of the facial marks characterized in
this investigation along with example markings on a few sample
images. Fig. 7 shows examples of different facial marks
manually annotated by observer 1, observer 2 and observer 3. We
observe that there is a difference in the number of facial marks
annotated and categorized between observers. Figs. 5 and 6 show
the category-wise distribution of the number of facial marks
annotated by each observer in session 1 and session 2. Table I
indicates the total number of facial marks annotated by observers in
session 1 and session 2. Based on these gross statistics, there is
large variation in how the four observers perceived facial marks
in the dataset.

Difficulties in manual annotations noted by observers and us 
include: 

1) Manual annotation is a difficult task because it involves 
training and familiarizing an individual with the definitions 
and characteristics of the different types of facial marks 
(apart from learning how to use the facial annotation tool). 

2) Observers experienced difficulty in differentiating between 
categories of facial marks, especially in the case of moles 
and freckles. 

3) Although all observers appear to have annotated the
prominent facial marks, some observers failed to annotate the
less prominent but visible marks.

4) It is a time consuming and an expensive task. 

Hence there is a need for an automatic facial mark detector 
to overcome these problems. Such a method is discussed in 
Section VII. 

VI. Facial Marks Based Matching 

We propose a matching approach that characterizes each face
image based on the corresponding mark locations between the
gallery and probe images. Each facial mark is characterized by a
facial mark category and geometric location on the face image.
Mark locations are transformed from image space to normalized
barycentric coordinate space. Barycentric coordinates are
normalized homogeneous coordinates that are scale, rotation and
translation invariant. They are computed with respect to a
reference triangle that forms the basis for the transformation. In
this work, the reference triangle is determined by the center of
each eye and the tip of the nose, which are automatically localized
using STASM [20]. Given the reference triangle, points inside
and outside the reference triangle can be expressed as a function
of the vertices of the triangle. Since barycentric coordinates are
computed using three vertices, they provide robustness to facial
pitch changes.

A. Bipartite Graph Matching

Similarity between two sets of facial marks is computed by
representing the matching problem using a weighted bipartite
graph. A weighted bipartite graph is defined as $G = (S, T; E)$,
where $S$ and $T$ denote two disjoint sets of vertices and $E$
denotes the connecting edges with corresponding nonnegative
cost. A bipartite graph with $N_1 + N_2$ nodes corresponding to
$N_1$ facial marks in image $I_1$ and $N_2$ facial marks in image $I_2$
is constructed. The edges correspond to potential matches
between facial marks in $I_1$ and $I_2$. The nonnegative weight
associated with each edge is a function of the Euclidean distance between
the normalized feature locations being compared. Facial marks of
the same category should be compared against each other (e.g.,
mole against mole, freckle against freckle, etc.). Therefore, to
avoid correspondence between facial marks of different
categories, edges connecting marks belonging to different
categories are given infinite weight. A potential match is determined
if the Euclidean distance between a pair of feature centroids
(belonging to the same category) is less than a threshold A. The
optimal correspondences are then established by executing the
standard Hungarian bipartite graph matching algorithm [21] on the
set of potential matches. The normalized similarity score $S_{ij}$ for
the comparison between image $I_i$ and image $I_j$ is given by

$$S_{ij} = \frac{M}{\max(N_i, N_j)}, \tag{1}$$

where $M$ is the total number of correspondences established
across the two images being compared and $N_i$ and $N_j$ are the
numbers of facial marks in $I_i$ and $I_j$. The similarity metric
is used as a match score between two images.

Fig. 5. Variation in number of facial marks of different types identified by each
observer during session 1.

Fig. 6. Variation in number of facial marks of different types identified by each
observer during session 2.
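
The matching step and the normalized score of equation (1) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each mark is a `(category, location)` pair with the location already in normalized coordinates, replaces the infinite edge weight with a large finite constant `BIG`, and pads the cost matrix to a square so that a textbook Hungarian routine applies. The names `hungarian`, `similarity`, `BIG` and the `threshold` argument are hypothetical.

```python
import math

BIG = 1e6  # finite stand-in for the "infinite" weight on forbidden edges


def hungarian(cost):
    """Minimum-cost assignment on a square cost matrix (Hungarian method).
    Returns a list giving the column assigned to each row."""
    n = len(cost)
    u = [0.0] * (n + 1)          # row potentials
    v = [0.0] * (n + 1)          # column potentials
    p = [0] * (n + 1)            # p[j]: 1-based row matched to column j
    way = [0] * (n + 1)
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [math.inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], math.inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                # augment along the alternating path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    col_of_row = [0] * n
    for j in range(1, n + 1):
        col_of_row[p[j] - 1] = j - 1
    return col_of_row


def similarity(marks_i, marks_j, threshold):
    """Score M / max(N_i, N_j) for two lists of (category, location) marks."""
    n_i, n_j = len(marks_i), len(marks_j)
    if n_i == 0 or n_j == 0:
        return 0.0
    size = max(n_i, n_j)
    cost = [[BIG] * size for _ in range(size)]   # padding cells stay forbidden
    for r, (cat_r, loc_r) in enumerate(marks_i):
        for c, (cat_c, loc_c) in enumerate(marks_j):
            d = math.dist(loc_r, loc_c)
            if cat_r == cat_c and d < threshold:  # potential match
                cost[r][c] = d
    assignment = hungarian(cost)
    matches = sum(1 for r, c in enumerate(assignment) if cost[r][c] < BIG)
    return matches / max(n_i, n_j)
```

Because `BIG` dominates every admissible distance, the minimum-cost assignment first uses as many potential matches as possible and only then minimizes total distance; assigned pairs that still carry cost `BIG` are not counted as correspondences.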


B. Preliminary Results: Manual Annotations 

We perform extensive experiments on the manually annotated
dataset. Each experiment is designed and implemented such that
it evaluates the usefulness of facial marks as a biometric
signature to distinguish between identical twins. The outcomes of
the experiments provide answers to the following questions:

1) How well do facial marks annotated by a single observer
distinguish between identical twins?

2) Is performance consistent when comparing annotations 
from different observers against each other? 

3) Is performance consistent when comparing annotations 
from the same observer in different sessions? 

The different experiments are designed as follows: 

1) Experiment 1 

a) Experiment Setup: Compare the query set against the 
target set annotated by the same observer in the same 
session. 

b) Target Set: Session n Data, where n represents the 
session number and n = 1 or 2. 

c) Query Set: Session n Data 

d) Inference: Facial marks can be used as potential 
biometric signatures to distinguish between identical 
twins. 

2) Experiment 2 

a) Experiment Setup: Compare the query set against the 
target set annotated by the same observer in different 
sessions 

b) Target Set: Session 1 Data 

c) Query Set: Session 2 Data 

d) Inference: Performance degrades due to
inconsistency in annotations made by an observer across
sessions.

3) Experiment 3 

a) Experiment Setup: Compare the query set annotated
by one observer against the target set annotated by
another observer. For example, a query set annotated by
observer 1 is compared against a target set annotated
by observer 2, and the process is known as observer
1 versus observer 2.

b) Target Set: Session n Data, where n represents the
session number and n = 1 or 2.

c) Query Set: Session n Data

d) Inference: Performance degrades due to
inconsistency in perceiving facial marks by different
annotators.

Fig. 7. Different facial marks manually annotated by each observer in session 1 for a given image, (a) observer 1; (b) observer 2; (c) observer 3.


TABLE I

Total Number of Facial Marks Annotated by Each Observer During
Session 1 and Session 2 of Manual Annotations

                 Number of Facial Marks Annotated
Observer ID      Session 1      Session 2
observer 1       4879           2526
observer 2       2208           1684
observer 3       3606           N/A
observer 4       N/A            2079


Fig. 8 compares the Receiver Operating Characteristic
(ROC) curves for Experiments 1, 2, and 3. We observe that
the best performance is obtained when comparing annotations
made by a single observer in the same session, i.e., Experiment
1. This indicates that facial marks are useful in distinguishing
between identical twins. However, the performance curves
obtained for Experiments 2 and 3 exhibit a significant degradation
in performance. Experiment 2 indicates that an individual observer
perceives facial marks differently over time, so that the
annotations are inconsistent across sessions. Similarly,
Experiment 3 indicates that different observers view facial marks
differently, leading to inconsistency. The inconsistency in
performance observed in Experiment 2 and Experiment 3 is a
major drawback of using manually annotated facial marks to
differentiate between identical twins. Hence, in order to obtain
consistency, there is a need for a robust and efficient automatic
facial mark detector. In the following sections we present such
a multiscale automatic facial mark detector.

VII. Automatic Facial Mark Detection 

Fig. 9 presents a detailed overview of the proposed
automatic facial mark detector. Facial mark detection is performed
at different scales, which is achieved by constructing a Gaussian
pyramid [22]. For each image in the pyramid the following steps
are applied:



Fig. 8. Performance curves obtained for the different types of experiments
executed on the manually annotated dataset.


1) The primary facial features like eyes, eyebrows, lips and 
nostrils are localized using an Active Shape Model (ASM). 

2) Individual masks are created based on the output of the 
ASM to mask the primary features. 

3) A gradient-based interest operator (the fast radial
symmetry transform) is applied to the masked image. The
transform detects regions of high radial symmetry.

4) Applying a threshold to the output of the fast radial
symmetry transform results in detecting bright or dark regions
of high radial symmetry, which correspond to potential
facial marks.

The potential facial marks are detected at each level of the image
pyramid. Those present in two or more levels are considered
to be reliable facial marks. Presently, we do not categorize the
detections into different types; we just treat them as point
features. In the following sections we describe the process of image
pyramid construction, localization of primary facial features,
mask generation, and the fast radial symmetry transform.

Fig. 9. Diagram highlighting the main steps of the proposed multiscale automatic facial marks detection process.


A. Gaussian Pyramid Construction 

The objective is to detect facial marks that are stable across
different scales. This can be achieved by using a Gaussian
pyramid. The Gaussian pyramid consists of a set of low-pass
filtered and subsampled images. The original image is defined
at the base level. The successive levels of the pyramid are
obtained by filtering the image in the previous level and
downsampling it by a factor of 2. The Gaussian pyramid [22] is defined by

$$G_0(x, y) = I(x, y), \quad \text{for } l = 0 \tag{2}$$

$$G_l(x, y) = \sum_{m=-2}^{2} \sum_{n=-2}^{2} w(m, n)\, G_{l-1}(2x + m, 2y + n), \tag{3}$$

where $G_0(x, y)$ is the base image of size $N \times N$ and $G_l(x, y)$
represents the images in the subsequent levels, $l$ is the level
number and $w(m, n)$ is a Gaussian filter of size 5 x 5.
The number of levels in a Gaussian pyramid is defined by
$l = [\log_2 N]$. In this study we define $l = 5$. Facial marks are
detected at each level and then tracked across levels to signify
their prominence.
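
Equations (2) and (3) can be illustrated with a short sketch on images stored as nested lists. Two details are assumptions rather than statements from the text: the 5 x 5 filter $w(m, n)$ is taken to be the separable binomial kernel [1, 4, 6, 4, 1]/16, and samples that fall outside the image are clamped to the nearest border pixel. The names `reduce_level` and `gaussian_pyramid` are hypothetical.

```python
# Assumed separable weights; their outer product sums to 1.
KERNEL_1D = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def reduce_level(prev):
    """One application of equation (3): each output pixel (x, y) is the
    weighted sum of the 5 x 5 neighborhood around (2x, 2y) in `prev`."""
    h, w = len(prev), len(prev[0])
    out_h, out_w = h // 2, w // 2
    out = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            acc = 0.0
            for m in range(-2, 3):
                for n in range(-2, 3):
                    # Border handling by clamping is an assumption.
                    sx = min(max(2 * x + m, 0), w - 1)
                    sy = min(max(2 * y + n, 0), h - 1)
                    acc += KERNEL_1D[m + 2] * KERNEL_1D[n + 2] * prev[sy][sx]
            out[y][x] = acc
    return out


def gaussian_pyramid(image, levels):
    """Return [G_0, G_1, ...] with at most `levels` images; G_0 is the
    input image itself, as in equation (2)."""
    pyramid = [image]
    for _ in range(1, levels):
        prev = pyramid[-1]
        if len(prev) < 2 or len(prev[0]) < 2:
            break  # nothing left to subsample
        pyramid.append(reduce_level(prev))
    return pyramid


# A flat 32 x 32 test image; every level should remain flat.
pyramid = gaussian_pyramid([[10.0] * 32 for _ in range(32)], levels=5)
```

Here `levels` counts the base image, so `levels=5` yields images of side 32, 16, 8, 4 and 2; whether the paper's "$l = 5$" counts the base level is not stated, so this convention is also an assumption.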

B. Detection of Primary Facial Features 

The contours of primary facial features like eyes, eyebrows, 
nostrils and lips are detected using an Active Shape Model. 
Primary facial features must be masked before the detection 
process to avoid detections that are caused by their presence. 
The Active Shape Model was first presented by Cootes et al. 
[10]. Once an ASM is trained based on the data found in the 
training set, it iteratively deforms a contour to fit the new image. 


The ASM defines two components of an object, the model
shape and the profile. The model shape defines the shape of the
contour. The profile is defined for each contour point and describes
what the image looks like around each point in the model. We
use an open source implementation of ASM called STASM
[20]. This detects 68 facial landmark points corresponding
to the contours of the primary facial features. Using these
landmark points, a mask is created for each image to mask out
the primary facial features; the results are referred to as masked images.

C. Fast Radial Symmetry Detector 

We propose to apply the fast radial symmetry transform to
detect the desired facial marks. FRST was defined by Loy et al.
[9]; it is similar to a generalized symmetry transform [23] and
the circular Hough transform [24]. The output of the transform
highlights radially symmetrical regions and suppresses regions
that are asymmetrical. For a given image $I$, FRST determines
the contribution of each pixel $p$ to the symmetry over a set of
radii $n \in N$. The radii set $N$ is defined based on the size of
the different facial marks that appear on the face. Also, at each
image level a different set is defined.

Fig. 9 illustrates the multiscale automatic detection process.
Initially, images are represented at different scales. Next, for
each image, the gradient image $g$ is computed using a 3 x 3
Sobel operator. We compute the orientation projection image
$O_n$ and the magnitude projection image $M_n$ at each point $p$ of
the gradient image $g$ for every value in $N$. These images are
determined by observing the gradient $g$ at each point $p$, from which
a corresponding positively-affected pixel and negatively-affected
pixel are determined. Positively-affected pixels are
defined as pixels that lie along the direction of the gradient vector
$g$ and negatively-affected pixels are pixels in the direction
directly opposite to the gradient vector $g$, as shown in Fig. 10. The
coordinates of the positively-affected and negatively-affected
pixels are defined as

$$p_{+}(p) = p + \mathrm{round}\left(\frac{g(p)}{\|g(p)\|}\, n\right) \tag{4}$$

$$p_{-}(p) = p - \mathrm{round}\left(\frac{g(p)}{\|g(p)\|}\, n\right) \tag{5}$$


where $g(p)$ is the gradient vector and $\|g(p)\|$ is the
magnitude of the gradient vector. Initially the orientation projection
image $O_n$ and the magnitude projection image $M_n$ are zero.
The points in $O_n$ and $M_n$ corresponding to a pair of
positively-affected or negatively-affected pixels are computed by

$$O_n(p_{+}(p)) = O_n(p_{+}(p)) + 1 \tag{6}$$

$$O_n(p_{-}(p)) = O_n(p_{-}(p)) - 1 \tag{7}$$

$$M_n(p_{+}(p)) = M_n(p_{+}(p)) + \|g(p)\| \tag{8}$$

$$M_n(p_{-}(p)) = M_n(p_{-}(p)) - \|g(p)\| \tag{9}$$


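The orientation projection image and magnitude projection
image capture the radial features of a particular region. The
radial symmetry contribution $S_n$ at radius $n$ is given by

$$ S_n = F_n * A_n \tag{10} $$

where $F_n$ is given by

$$ F_n(p) = \frac{M_n(p)}{k_n} \left( \frac{|\tilde{O}_n(p)|}{k_n} \right)^{\alpha} \tag{11} $$

and $\tilde{O}_n(p)$ is

$$ \tilde{O}_n(p) = \begin{cases} O_n(p) & \text{if } O_n(p) < k_n \\ k_n & \text{otherwise} \end{cases} \tag{12} $$

where $A_n$ is the Gaussian kernel, $\alpha$ is the degree of radial
strictness and $k_n$ is the normalizing factor. The final symmetry
image $S$ is formed by averaging the radial symmetry contributions
over all radii $n \in N$ as

$$ S = \frac{1}{|N|} \sum_{n \in N} S_n \tag{13} $$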
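
To make Eqs. (10)-(13) concrete, the following sketch computes a per-radius contribution and averages the contributions over the radii. It assumes, as in the standard FRST formulation, that the contribution is $F_n$ convolved with the Gaussian kernel. The function names and the kernel parameters `sigma` and `half_width` are hypothetical, and suitable values for `k_n` and `alpha` are not given in the text, so they are left as arguments.

```python
import math


def gaussian_kernel(sigma, half_width):
    """Normalized 2-D Gaussian kernel with (2 * half_width + 1) rows and columns."""
    size = range(-half_width, half_width + 1)
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma)) for x in size]
              for y in size]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]


def convolve(image, kernel):
    """Same-size 2-D convolution with zero padding (the kernel is symmetric)."""
    rows, cols = len(image), len(image[0])
    h = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in range(-h, h + 1):
                for dc in range(-h, h + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += image[rr][cc] * kernel[dr + h][dc + h]
            out[r][c] = acc
    return out


def symmetry_contribution(O, M, k_n, alpha, sigma, half_width):
    """S_n: F_n from Eqs. (11)-(12) smoothed by an assumed Gaussian A_n."""
    F = []
    for o_row, m_row in zip(O, M):
        f_row = []
        for o, m in zip(o_row, m_row):
            o_clamped = o if o < k_n else k_n  # Eq. (12)
            f_row.append((m / k_n) * (abs(o_clamped) / k_n) ** alpha)  # Eq. (11)
        F.append(f_row)
    return convolve(F, gaussian_kernel(sigma, half_width))


def symmetry_image(contributions):
    """Average the per-radius contributions S_n over all radii, Eq. (13)."""
    count = len(contributions)
    rows, cols = len(contributions[0]), len(contributions[0][0])
    return [[sum(s[r][c] for s in contributions) / count for c in range(cols)]
            for r in range(rows)]
```

The explicit double loop keeps the example dependency-free; for real images a separable or FFT-based convolution would be the more practical choice.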


After the radial symmetry image is computed, we apply
hysteresis thresholding to detect dark or bright regions of high
radial symmetry, resulting in a binary image $B(x, y)$.
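
A minimal sketch of hysteresis thresholding on the symmetry image is given below. The text does not state the thresholds or how dark and bright regions are handled jointly, so this sketch assumes that the absolute value of $S$ is thresholded and that 8-connectivity is used; `low` and `high` are hypothetical parameters.

```python
from collections import deque


def hysteresis_threshold(S, low, high):
    """Binary image B: pixels with |S| >= high, plus pixels with |S| >= low
    that are 8-connected to such a strong pixel."""
    rows, cols = len(S), len(S[0])
    B = [[0] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every strong pixel.
    for r in range(rows):
        for c in range(cols):
            if abs(S[r][c]) >= high:
                B[r][c] = 1
                queue.append((r, c))
    # Grow each seed through neighbouring weak pixels.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols and not B[rr][cc]
                        and abs(S[rr][cc]) >= low):
                    B[rr][cc] = 1
                    queue.append((rr, cc))
    return B
```

Weak responses survive only when they touch a strong response, which is what lets the low threshold be permissive without admitting isolated noise.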

Then we apply connected component analysis to detect
connected regions, which correspond to potential facial marks. For
each image, we carry out the above procedure at different levels
of the pyramid; hence, we have a set of potential facial marks
at each level. Next, we track the potential facial marks across
levels to form the final set of detections. Detections present at
two or more levels are included in the final detection set. Once
the final detection set is calculated, we perform postprocessing
to reduce the number of false positives.
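
The connected component and cross-level tracking steps could look roughly like the sketch below. The text does not specify how detections are associated across pyramid levels, so the nearest-centroid rule used here, along with the `scale`, `tolerance`, and function names, is an assumption made purely for illustration.

```python
from collections import deque


def component_centroids(B):
    """Centroids (row, col) of the 8-connected foreground regions of B."""
    rows, cols = len(B), len(B[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not B[r0][c0] or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue, pixels = deque([(r0, c0)]), []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and B[rr][cc] and not seen[rr][cc]):
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            count = len(pixels)
            centroids.append((sum(p[0] for p in pixels) / count,
                              sum(p[1] for p in pixels) / count))
    return centroids


def track_across_levels(levels, scale=2.0, tolerance=3.0, min_levels=2):
    """levels[k] holds the centroids found at pyramid level k (0 = full size).
    Returns (row, col, level_count) for marks seen at >= min_levels levels."""
    tracks = []  # each track: [row, col, levels_seen] in level-0 coordinates
    for k, centroids in enumerate(levels):
        factor = scale ** k  # assumed downsampling factor between levels
        for r, c in centroids:
            r0, c0 = r * factor, c * factor
            for track in tracks:
                if (abs(track[0] - r0) <= tolerance
                        and abs(track[1] - c0) <= tolerance):
                    track[2].add(k)
                    break
            else:
                tracks.append([r0, c0, {k}])
    return [(t[0], t[1], len(t[2])) for t in tracks if len(t[2]) >= min_levels]
```

The returned level count is the quantity that the figures encode by color and that later serves as a weighting factor during matching.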

Fig. 10. Positively and negatively affected pixels influenced by point $p$ of the
gradient image $g$ for $n = 2$.

False positives occur mainly due to the presence of hair on the
forehead. These are eliminated by using dominant orientation
information obtained from the local gradients of the detected
regions [25]. The dominant orientation information for each
detected region is computed by calculating the Singular Value
Decomposition (SVD) of the gradient matrix $G$ of the region. The
region gradient matrix is defined as [26]:


$$ G = \begin{bmatrix} \vdots & \vdots \\ g_x(k) & g_y(k) \\ \vdots & \vdots \end{bmatrix} \tag{14} $$


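and the local dominant orientation information is obtained by
computing the SVD of $G$,

$$ G = U S V^{T} = U \begin{bmatrix} s_1 & 0 \\ 0 & s_2 \end{bmatrix} \begin{bmatrix} v_1 & v_2 \end{bmatrix}^{T} \tag{15} $$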


where $U$ and $V$ are orthonormal matrices. The singular values
$s_1$ and $s_2$ describe the energy in the directions of the vectors
$v_1$ and $v_2$. If the energy in the direction of vector $v_1$
exceeds a threshold for a given region, then we consider that
region a hair region and eliminate it. Also, if the size of a
detection is greater than a predefined value $S_t$, the detection
is eliminated. The remaining detections are characterized as
facial marks. In Fig. 11, the square markers represent the
potential facial marks detected (before postprocessing). The
detections represented by circular markers are eliminated using
SVD, and the detections represented by square markers indicate
the true positives. The different colors indicate the number of
scales at which the facial marks were detected. The detected
facial marks are shown in Fig. 12. Currently, each facial mark is
defined based only on its geometric location, and we do not
classify marks into different categories. They are treated as
point features. Once the point features are computed, we perform
facial mark matching.
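
The hair test based on the SVD of the region gradient matrix can be sketched without a linear algebra library, because the singular values of an N x 2 matrix $G$ are the square roots of the eigenvalues of the 2 x 2 matrix $G^{T} G$. The function names and the `energy_threshold` parameter are hypothetical, and the threshold value is not given in the text.

```python
import math


def dominant_orientation(gradients):
    """gradients: list of (gx, gy) pairs, i.e. the rows of the region matrix G.
    Returns (s1, s2, v1): the singular values of G and the dominant unit vector."""
    a = sum(gx * gx for gx, _ in gradients)
    b = sum(gx * gy for gx, gy in gradients)
    c = sum(gy * gy for _, gy in gradients)
    # Closed-form eigenvalues of the symmetric matrix G^T G = [[a, b], [b, c]].
    mean = (a + c) / 2.0
    spread = math.hypot((a - c) / 2.0, b)
    s1 = math.sqrt(mean + spread)
    s2 = math.sqrt(max(mean - spread, 0.0))
    theta = 0.5 * math.atan2(2.0 * b, a - c)  # angle of the leading eigenvector
    return s1, s2, (math.cos(theta), math.sin(theta))


def is_hair_region(gradients, energy_threshold):
    """Flag a region whose energy along v1 exceeds the (assumed) threshold."""
    s1, _, _ = dominant_orientation(gradients)
    return s1 > energy_threshold
```

A strand of hair produces gradients that mostly share one direction, so $s_1$ is large relative to $s_2$, whereas a roughly circular mark spreads its gradient energy over both directions.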


D. Bipartite Graph Matching for Automatically Detected 
Facial Marks 

The process for matching facial marks detected by the
multiscale automatic facial mark detector is similar to that
described in Section VI. In the case of automatically detected
facial marks, each facial mark is characterized only by its
geometric location on the corresponding face image. Therefore,
automatically detected facial marks are treated as point features
and can be viewed as all belonging to the same category.

Fig. 11. (a) Potential facial marks detected by the multiscale automatic facial mark detector. The different colors indicate the number of scales at which the facial
marks are detected. Red is two levels, blue is three levels, green is four levels, and yellow is five levels. (b) Detections represented by circles are potential
false positives and are removed using gradient-based SVD.

Fig. 12. Facial marks detected by the multiscale automatic facial mark detector for a pair of twins (input images of Twin A and Twin B). The different colors indicate the number of scales at which the
facial marks are detected. Red is two levels, blue is three levels, green is four levels, and yellow is five levels.

The locations of facial marks are compared across images in the
barycentric coordinate system as described in Section VI.
Similarity is computed by formulating the matching process in
terms of a weighted bipartite graph. The normalized similarity
between two images $i$ and $j$ is defined as


V _ 2 _ 

^ maxfVj. Nj) 


(16) 


where the optimal matches are represented by a set
$M = \{\ldots, m_j\}$ with $m_k = (p_k \in N_1, q_k \in N_2)$,
and $w_{p_k}$ and $w_{q_k}$ are weighting factors that correspond to the
number of levels at which the facial marks are detected.


VIII. Experimental Setup and Results 

Fig. 13. Twins versus Twins and All versus All scenarios used for comparing the target set and the query set.

We perform extensive experiments on both the manually
annotated facial marks and the automatically detected facial
marks. In addition to the experiments mentioned in Section VI
(Experiment 1, Experiment 2 and Experiment 3), we include
Experiment 4 and Experiment 5, which evaluate the performance
of the multiscale automatic facial mark detector. These new
experiments are described as follows:

1) Experiment 4

a) Experiment Setup: Match the facial marks detected
by the multiscale automatic facial mark detector. The
query set and the target set are both composed of
detections from the automatic facial mark detector on
the same dataset, either session 1 or session 2.

b) Target Set: Detections from Session-n data, where n
represents the session number and n = 1 or 2.

c) Query Set: Detections from Session-n data.

d) Inference: Performance is lower than in
Experiment 1, but it is more consistent across
scenarios.

2) Experiment 5

a) Experiment Setup: Match the facial marks detected
by the multiscale automatic facial mark detector. The
query set and the target set are composed of detections
from the automatic facial mark detector from different
sessions.

b) Target Set: Detections from Session-1 data.

c) Query Set: Detections from Session-2 data.

d) Inference: Consistent performance is observed
when comparing facial marks across sessions, unlike
the trend observed with manual detections in
Experiment 2 and Experiment 3.

For each of the aforementioned experiments, we define two
different scenarios (Twins versus Twins and All versus All) for
comparing the target set and the query set to generate
performance curves. Fig. 13 depicts both of these scenarios. The
main difference between the two scenarios lies in the generation
of the impostor scores. For both scenarios, the genuine scores
are obtained by comparing face images of a subject in the query
set against other images of the same subject in the target set.
However, for the Twins versus Twins scenario, the impostor scores
are obtained by comparing a face image of a subject in the target
set only against the images of the subject's twin. The impostor
scores for All versus All are obtained by comparing a face image
of a subject in the target set against images of all other
subjects in the query set.
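
The difference between the two scenarios can be illustrated with a short sketch that enumerates the genuine and impostor comparisons. The record layout `(image_id, subject_id, twin_pair_id)` and the scenario labels are assumptions chosen for the example, not part of the dataset description.

```python
def comparison_pairs(target, query, scenario):
    """target, query: lists of (image_id, subject_id, twin_pair_id) records.
    scenario: "twins_vs_twins" or "all_vs_all".
    Returns (genuine, impostor) lists of (target_image_id, query_image_id)."""
    genuine, impostor = [], []
    for t_img, t_subj, t_pair in target:
        for q_img, q_subj, q_pair in query:
            if t_img == q_img:
                continue  # never compare an image with itself
            if t_subj == q_subj:
                genuine.append((t_img, q_img))
            elif scenario == "all_vs_all" or t_pair == q_pair:
                # Twins versus Twins keeps only the subject's own twin.
                impostor.append((t_img, q_img))
    return genuine, impostor
```

The genuine list is identical in both scenarios; only the impostor list grows when moving from Twins versus Twins to All versus All.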



Fig. 14. Distribution of moles, freckles, and acne across the first 50 subjects in 
the dataset, i.e., 25 pairs of twins. 


Finally, a last variation, applied only to Experiment 1, is that
it is executed for different subsets of the facial mark categories
provided. Each manually annotated facial mark is characterized
by its geometric location and a facial mark category. Among
the different categories of facial marks defined, moles, freckles
and pimples are the most common and prominent. Therefore, we
define different subsets of facial mark categories to determine
whether some categories are more useful than others.
Fig. 14 shows the distribution of moles, freckles and pimples for
the first 50 subjects (25 pairs of twins). The following subsets
of facial marks are considered:

1) FM1 = {moles, freckles, freckle group, birthmark,
darkened patch, lightened patch, splotchiness, raised
skin, pockmark, scar round, scar linear}

2) FM2 = {moles, freckles}

3) FM3 = {moles, freckles, pimple}

4) FM4 = {}; i.e., each manually annotated facial mark is
characterized only by its geometric location and the facial
mark category is ignored.
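
One plausible way to apply these subsets before matching is sketched below. It assumes that a non-empty subset restricts the comparison to marks whose category belongs to it, and that the empty subset FM4 keeps every mark but discards its category; the `(x, y, category)` tuple layout and the function name are hypothetical.

```python
FM1 = frozenset({
    "moles", "freckles", "freckle group", "birthmark", "darkened patch",
    "lightened patch", "splotchiness", "raised skin", "pockmark",
    "scar round", "scar linear",
})
FM2 = frozenset({"moles", "freckles"})
FM3 = frozenset({"moles", "freckles", "pimple"})
FM4 = frozenset()  # category ignored: location only


def select_marks(marks, subset):
    """marks: list of (x, y, category) tuples.
    An empty subset keeps every mark and drops its category."""
    if not subset:
        return [(x, y, None) for x, y, _ in marks]
    return [mark for mark in marks if mark[2] in subset]
```

Under FM4 the manually annotated marks reduce to the same location-only representation that the automatic detector produces.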

Fig. 15. ROC curves of Twins versus Twins comparison for various sets of
facial marks annotated by observer 1 in session 1.

Fig. 16. Performance of All versus All comparison for various sets of facial
marks annotated by observer 1 in session 1.

Since the automatically detected facial marks are characterized
only by their geometric location, we do not have to execute
the experiments based on the subsets of facial marks. We use
Receiver Operating Characteristic (ROC) curves and Equal Error
Rate (EER) as performance measures to evaluate the different
experiments. An ROC curve that lies closer to the upper-left
corner indicates a better performing system and, correspondingly,
a lower value of EER.
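
For readers unfamiliar with the EER, the sketch below shows one common way to approximate it from lists of genuine and impostor similarity scores. It assumes that higher scores mean greater similarity and that the EER is read off at the threshold where the false accept rate (FAR) and false reject rate (FRR) are closest; the function names are illustrative, and the paper does not state which estimator it uses.

```python
def error_rates(genuine, impostor):
    """Return [(threshold, FAR, FRR)], accepting a comparison when its
    similarity score is >= threshold."""
    points = []
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)
        frr = sum(s < t for s in genuine) / len(genuine)
        points.append((t, far, frr))
    return points


def equal_error_rate(genuine, impostor):
    """Approximate EER: mean of FAR and FRR where the two are closest."""
    _, far, frr = min(error_rates(genuine, impostor),
                      key=lambda point: abs(point[1] - point[2]))
    return (far + frr) / 2.0
```

Plotting 1 - FRR against FAR over the same thresholds gives the ROC curve, so the EER is a single-number summary of where that curve crosses the anti-diagonal.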

A. Results: Experiment 1 

Figs. 15 and 17 present the ROC curves for Experiment 1 on the
session 1 and session 2 datasets for the Twins versus Twins
comparison using different subsets of facial marks. Similarly,
Figs. 16 and 18 present the ROC curves for Experiment 1 on the
session 1 and session 2 datasets for the All versus All comparison
using different subsets of facial marks. Considering subsets of
facial marks does not significantly improve performance. In fact,
the best performance is achieved by considering all categories of
facial marks. We observe that there is no significant degradation
in performance when facial mark categories are ignored. This
indicates that the geometric location of facial marks is a robust
feature that can be used to differentiate between identical twins.
A similar trend is observed in both scenarios. An interesting
result is observed when comparing the performance between Twins
versus Twins comparisons and All versus All comparisons.
Performance is significantly better when comparing facial marks
across unrelated individuals (All versus All scenario) than when
comparing facial marks across identical twins. This leads to the
inference that the distributions of facial marks across identical
twins appear to be correlated.

Fig. 17. ROC curves of Twins versus Twins comparison for various sets of
facial marks annotated by observer 1 in session 2.

Fig. 18. Performance of All versus All comparison for various sets of facial
marks annotated by observer 1 in session 2.

The distributions of genuine and impostor scores for both
comparisons from session 1 are shown in Fig. 19. There is a
larger overlap between the match and nonmatch scores for the
Twins versus Twins comparison than for the All versus All
comparison, leading to higher error and lower performance.
Hence, it is easier to differentiate unrelated persons using
facial mark distribution than identical twins. Although results
are provided based on the facial marks annotated by observer 1,
similar trends are seen in the case of facial marks annotated by
other observers.

Fig. 19. Distributions of match and nonmatch scores for Twins versus Twins
and All versus All comparisons.

Fig. 20. Comparing the performance curves of observer 1 for both All versus
All and Twins versus Twins comparisons, for Experiment 2.

B. Results: Experiment 2 

In Experiment 2, we compare manually annotated facial
marks across sessions. Only two observers, observer 1 and
observer 2, took part in both sessions. Hence, we can
compare only the annotations made by them. Fig. 20 presents the
ROC curves for both the Twins versus Twins and All versus
All comparisons considering all facial marks (FM). Again, we
observe that the All versus All comparison outperforms
the Twins versus Twins comparison. Ideally, the performance
should be similar to the results of Experiment 1. However,
there is a degradation in overall performance when comparing
manually detected facial marks across sessions, indicating that
observers are not consistent over time. Table II lists the EER
computed for both the twin and unrelated person comparisons
across sessions for observer 1 and observer 2.

TABLE II
Equal Error Rates Computed for Twins versus Twins and All versus
All Comparisons Across Sessions for Observer 1 and Observer 2

Observer      Type of Experiment    EER Across Sessions
observer 1    Twins vs. Twins       29.88%
observer 2    Twins vs. Twins       31.45%
observer 1    All vs. All           29.88%
observer 2    All vs. All           24.75%

Fig. 21. Performance curves for Experiment 3 for Twins versus Twins and All
versus All comparisons for observer 1 versus observer 3, session 1.

C. Results: Experiment 3 

In this experiment, similarity scores are computed by
comparing a query set annotated by one observer to a target set
annotated by a different observer. Performance curves for these
experiments for each session are shown in Fig. 21 for observer
1 versus observer 3 and Fig. 22 for observer 1 versus observer
4, for both Twins versus Twins and All versus All scenarios.
The performance degrades considerably when facial marks
annotated by different observers are compared against each other.
This occurs due to the variation in facial mark annotations
between observers. There is a lack of uniformity in facial mark
annotations across observers. Though all observers annotated the
prominently visible facial marks, a few failed to annotate the
less prominent marks. However, even in these experiments, the
performance of All versus All comparisons is better than that of
Twins versus Twins comparisons, suggesting greater similarity in
the distribution of facial marks across twins than across
unrelated persons.

D. Results: Experiment 4 and Experiment 5 

Experiment 4 and Experiment 5 evaluate the performance of
the proposed multiscale automatic facial mark detector. A total
of 4602 facial marks were detected in the session 1 dataset and
3012 facial marks were detected in the session 2 dataset. Figs. 23
and 24 show the performance curves for Experiment 4 for both
the Twins versus Twins and All versus All comparisons for each
session of the dataset. The performance of the proposed detector
is relatively worse when compared with the results obtained for
Experiment 1, which uses manual annotations. The performance
curves for Experiment 5 are shown in Fig. 25. Facial marks
detected in the session 2 dataset are compared against facial
marks detected in the session 1 dataset. In theory, this is
analogous to Experiment 2. We obtain similar results for both
Twins versus Twins comparisons and All versus All comparisons in
Experiment 4 and Experiment 5. The automatic detector does not
exhibit the large variation in performance seen in Experiment 2
and Experiment 3.

Fig. 22. Performance curves for Experiment 3 for Twins versus Twins and All
versus All comparisons for observer 1 versus observer 4, session 2.

Fig. 23. ROC curves representing the performance of the proposed multiscale
automatic facial mark detector for Experiment 4, for session 1 dataset, for both
Twins versus Twins and All versus All comparison.

Finally, we compare the performance between manually
detected facial marks and automatically detected facial marks.
Fig. 26 compares performance across Experiment 1, Experiment 2,
Experiment 4, and Experiment 5 for the Twins versus
Twins comparison. Although the best performance is achieved
by comparing facial marks annotated by an observer at a given
time, there is a large degradation in performance when we
compare facial marks annotated by an observer over time,
indicating inconsistency in performance. This variation in
performance is not seen when comparing facial marks detected
by the proposed multiscale automatic facial mark detector.
Matching with automatically detected facial marks performs
better than matching facial marks annotated by an observer at
different instances of time.

Fig. 24. ROC curves representing the performance of the proposed multiscale
automatic facial mark detector for Experiment 4, for session 2 dataset, for both
Twins versus Twins and All versus All comparison.

Fig. 25. ROC curves representing the performance of the proposed multiscale
automatic facial mark detector for Experiment 5, for both Twins versus Twins and
All versus All comparison.

Similarly, we observe that the performance of the automatic
facial mark detector is better when compared with Experiment
3, i.e., when we compare facial marks annotated by different
observers, as shown in Fig. 27. The performance of the proposed
automatic facial mark detector is relatively consistent and
uniform. The same trend is observed in the All versus All
comparison.

The results obtained can be summarized as follows:

1) In every experiment, we observe that the All versus All
comparison performs better than the Twins versus Twins
comparison. This indicates that the facial mark distributions
across identical twins appear to be correlated.

2) The manual annotation process is a difficult and
time-consuming task, and hence there is a need for a robust
automatic facial mark detector.

3) Though the performance of the proposed multiscale
automatic facial mark detector is slightly lower than the
performance of manual annotations for a single session, it
exhibits greater consistency over time.

Fig. 26. Performance comparison between manually detected facial mark
experiments (Experiment 1 and Experiment 2) and automatically detected facial
mark experiments (Experiment 4 and Experiment 5), for Twins versus Twins
comparisons.

Fig. 27. Performance comparison between the manually detected facial mark
experiment (Experiment 3) and the automatically detected facial mark experiment
(Experiment 4), for Twins versus Twins comparisons.

IX. Conclusion 

In this paper, we analyzed the usefulness of facial marks as
a potential biometric signature for distinguishing between
identical twins. We proposed a multiscale automatic facial mark
detection system for distinguishing between identical twins solely
based on the geometric distribution of facial marks. Different
experiments were designed and implemented to highlight the
performance of the proposed automatic facial mark detector when
compared with the performance achieved by using manually
detected facial marks. From the results, there appears to be a
correlation in the distribution of facial marks across twins. This
phenomenon is observed across all experiments. In the future, we
will explore using richer facial mark characteristics such as
texture, shape and color to improve performance. We hope to
further explore the use of different matching algorithms and
compare them to the proposed matching algorithm. Facial mark
features can be fused with other facial features to enrich facial
characterizations for improved performance. The results of the
investigation make a case for the use of facial marks in biometric
characterization.